On the minimum FLOPs problem in the sparse Cholesky factorization
Authors
Abstract
Prior to computing the Cholesky factorization of a sparse, symmetric positive definite matrix, a reordering of the rows and columns is computed so as to reduce both the number of fill elements in the Cholesky factor and the number of arithmetic operations (FLOPs) in the numerical factorization. These two metrics are clearly related, yet it is suspected that the two minimization problems are different. However, no rigorous theoretical treatment of the relation between these two problems seems to have been given yet. In this paper we show by means of an explicit, scalable construction that the two problems are different in a very strict sense. In our construction no ordering that is optimal for the fill is optimal with respect to the number of FLOPs, and vice versa. Further, it is commonly believed that minimizing the number of FLOPs is no easier than minimizing the fill (in the complexity sense), but so far no proof appears to be known. We give a reduction chain that shows the NP-hardness of minimizing the number of arithmetic operations in the Cholesky factorization.
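To make the two metrics concrete, the following minimal Python sketch runs symbolic elimination (the "elimination game") on the adjacency graph of a matrix and reports both the fill and an operation count for a given ordering. It is an illustration only and does not reproduce the paper's construction. The function name `eliminate` and the star-graph example are hypothetical, and the cost model, η(η + 3)/2 multiplicative operations for a column with η off-diagonal nonzeros, is one common convention assumed here; the paper's exact FLOP count may differ.

```python
from itertools import combinations


def eliminate(adj, order):
    """Symbolic elimination on an undirected graph.

    adj   : dict mapping each vertex to the set of its neighbours
    order : elimination order (a permutation of the vertices)
    Returns (fill, flops) under the ASSUMED cost model eta*(eta+3)/2 per column.
    """
    g = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    fill = 0
    flops = 0
    for v in order:
        nbrs = g.pop(v)              # neighbours not yet eliminated
        for u in nbrs:
            g[u].discard(v)
        # Eliminating v turns its remaining neighbours into a clique;
        # every edge that has to be added is one fill element.
        for a, b in combinations(nbrs, 2):
            if b not in g[a]:
                g[a].add(b)
                g[b].add(a)
                fill += 1
        eta = len(nbrs)              # off-diagonal nonzeros in this column of L
        flops += eta * (eta + 3) // 2
    return fill, flops


# Hypothetical example: a star graph with centre 0 and leaves 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
centre_first = eliminate(star, [0, 1, 2, 3, 4])
leaves_first = eliminate(star, [1, 2, 3, 4, 0])
print(centre_first, leaves_first)
```

On this small graph the two metrics agree on which ordering is better. The abstract's claim is that this agreement fails in general: in the authors' construction, the fill-optimal and FLOP-optimal orderings are disjoint.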
Similar resources
Efficient Sparse Cholesky Factorization on a Massively Parallel SIMD Computer
We investigate the effect of load balancing when performing Cholesky factorization on a massively parallel SIMD computer. In particular we describe a supernodal algorithm for performing sparse Cholesky factorization. The way the matrix is mapped onto the processors has a significant effect on its efficiency. We show that this assignment problem can be modeled as a graph coloring problem in a weig...
Fast Sparse Matrix Factorization on Modern Workstations
The performance of workstation-class machines has experienced a dramatic increase in the recent past. Relatively inexpensive machines which offer 14 MIPS and 2 MFLOPS performance are now available, and machines with even higher performance are not far off. One important characteristic of these machines is that they rely on a small amount of high-speed cache memory for their high performance. In...
Improving Performance of Hypermatrix Cholesky Factorization
This paper shows how a sparse hypermatrix Cholesky factorization can be improved. This is accomplished by means of efficient codes which operate on very small dense matrices. Different matrix sizes or target platforms may require different codes to obtain good performance. We write a set of codes for each matrix operation using different loop orders and unroll factors. Then, for each matrix siz...
Implementing a parallel matrix factorization library on the cell broadband engine
Matrix factorization (often called decomposition) is a frequently used kernel in a large number of applications ranging from linear solvers to data clustering and machine learning. The central contribution of this paper is a thorough performance study of four popular matrix factorization techniques, namely, LU, Cholesky, QR, and SVD on the STI Cell broadband engine. The paper explores algori...
A Scalable Parallel Algorithm for Sparse Matrix Factorization
In this paper, we describe a scalable parallel algorithm for sparse matrix factorization, analyze its performance and scalability, and present experimental results of its implementation on a 1024-processor nCUBE2 parallel computer. Through our analysis and experimental results, we demonstrate that our algorithm improves the state of the art in parallel direct solution of sparse linear systems b...
Journal:
- SIAM J. Matrix Analysis Applications
Volume 35, Issue
Pages -
Publication date: 2014